Rarity of Words in a Language and in a Corpus

Author

  • Jaroslava Hlaváčová
Abstract

A simple method was presented last year (Hlaváčová, Rychlý) that makes it possible to distinguish automatically between rare and common words having the same frequency in a language corpus. The method operates with two new terms: reduced frequency and rarity. The rarity was proposed as a measure of word rareness or commonness in a language. This article deals with the rarity in more depth. Its value was calculated for several different corpora and compared. Two experiments were done on real data taken from the Czech National Corpus. Results of the first one prove that reordering of texts in the corpus does not influence the rarity of words with a high frequency in the corpus. In the second experiment, the rarity of the same words in two corpora of different sizes is compared.

Related resources

Vocabulary Lists for EAP and Conversation Students

Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...

A Corpus-Based Study of the Lexical Make-up of Applied Linguistics Article Abstracts

This paper reports results from a corpus-based study that explored the frequency of words in the abstracts of applied linguistics journal articles. The abstracts of major articles in leading applied linguistics journals, published since 2005 up to November 2001 were analyzed using software modules from the Compleat Lexical Tutor. The output includes a list of the most frequent content words, list...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

A'laam Corpus: A Standard Named-Entity Corpus for the Persian Language

Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. A NER system is tasked with determining the border of each named entity, recognizing its type and classifying it into predefined categories. The categories of named...

How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs

Many elements contribute to the relative difficulty in acquiring specific aspects of English as a foreign language (Goldschneider & DeKeyser, 2001). Modal auxiliary verbs (e.g. could, might) are examples of a structure that is difficult for many learners. Not only are they particularly complex semantically, but especially in the Malaysian context ...

Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

Publication date: 2000